Computer Science and Artificial Intelligence Laboratory
Some Properties of Empirical Risk Minimization Over Donsker Classes
Authors
Andrea Caponnetto, Alexander Rakhlin
Abstract
We study properties of algorithms which minimize (or almost-minimize) empirical error over a Donsker class of functions. We show that the L2-diameter of the set of almost-minimizers converges to zero in probability. Hence, as the number of samples grows, it becomes increasingly unlikely that adding a point (or a number of points) to the training set will result in a large jump (in L2 distance) to a new hypothesis. We also show that, under some conditions, the expected errors of the almost-minimizers become close at a rate faster than n−1/2.

This report describes research done at the Center for Biological & Computational Learning, which is in the McGovern Institute for Brain Research at MIT, as well as in the Dept. of Brain & Cognitive Sciences, and which is affiliated with the Computer Science & Artificial Intelligence Laboratory (CSAIL), as well as at the Dipartimento di Informatica e Scienze dell’Informazione (DISI) at the University of Genoa, Italy. This research was sponsored by grants from: Office of Naval Research (DARPA) Contract No. MDA972-04-1-0037, Office of Naval Research (DARPA) Contract No. N00014-02-1-0915, National Science Foundation (ITR/SYS) Contract No. IIS-0112991, National Science Foundation (ITR) Contract No. IIS-0209289, National Science Foundation-NIH (CRCNS) Contract No. EIA-0218693, National Science Foundation-NIH (CRCNS) Contract No. EIA-0218506, and National Institutes of Health (Conte) Contract No. 1 P20 MH66239-01A1. Additional support was provided by: Central Research Institute of Electric Power Industry (CRIEPI), Daimler-Chrysler AG, Compaq/Digital Equipment Corporation, Eastman Kodak Company, Honda R&D Co., Ltd., Industrial Technology Research Institute (ITRI), Komatsu Ltd., Eugene McDermott Foundation, Merrill-Lynch, NEC Fund, Oxygen, Siemens Corporate Research, Inc., Sony, Sumitomo Metal Industries, and Toyota Motor Corporation. This research has also been partially funded by the FIRB Project ASTAA and the IST Programme of the European Community, under the PASCAL Network of Excellence, IST-2002-506778.
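As a rough numerical illustration of the diameter claim (our own sketch, not from the paper), the following Python snippet runs almost-ERM over the class of threshold classifiers f_t(x) = 1{x ≥ t} on [0, 1] — a VC, hence Donsker, class — and tracks the L2-diameter of the set of almost-minimizers as the sample size grows. The grid resolution, the noise level, and the tolerance ξ(n) = n^(−0.6) = o(n^(−1/2)) are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
thresholds = np.linspace(0.0, 1.0, 501)      # grid over the class parameter t

def almost_minimizer_diameter(n, noise=0.1):
    """L2-diameter of {t : empirical risk(t) <= min risk + xi(n)}."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = (x >= 0.5) ^ (rng.uniform(size=n) < noise)   # labels with flip noise
    preds = x[None, :] >= thresholds[:, None]        # f_t(x_i) for all t, i
    risks = (preds != y[None, :]).mean(axis=1)       # empirical 0-1 risk per t
    xi = n ** -0.6                                   # tolerance o(n^{-1/2})
    near = thresholds[risks <= risks.min() + xi]     # almost-minimizers
    # under x ~ Uniform[0,1], ||f_s - f_t||_{L2}^2 = |t - s|, so:
    return np.sqrt(near.max() - near.min())

for n in (100, 1000, 10000, 100000):
    print(n, almost_minimizer_diameter(n))           # diameter should shrink
```

Running this, the printed diameters should shrink toward zero as n grows, consistent with the convergence-in-probability statement above (for a single run there are no probabilistic guarantees, of course).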
Similar resources
Some Properties of Empirical Risk Minimization Over Donsker Classes
We study properties of algorithms which minimize (or almost-minimize) empirical error over a Donsker class of functions. We show that the L2-diameter of the set of almost-minimizers converges to zero in probability. Hence, as the number of samples grows, it becomes increasingly unlikely that adding a point (or a number of points) to the training set will result in a large jump (in L2 distance) t...
Full text
Stability Properties of Empirical Risk Minimization over Donsker Classes
We show that, as the number n of samples grows, the L2-diameter of the set of almost-minimizers of empirical error (with tolerance o(n−1/2)) converges to zero in probability. Hence, even in the case of multiple minimizers of expected error, as n increases it becomes less and less likely that adding a sample (or a number of samples) to the training set will result in a large jump to a new hypothesis. Moreover, under some assumptions on the entropy of the class, along with an assumption of Komlos-Major-Tusnady type, we derive a po...
Full text
Learning theory: stability is sufficient for generalization and necessary and sufficient for consistency of empirical risk minimization
Sayan Mukherjee a,b, Partha Niyogi c, Tomaso Poggio a, and Ryan Rifkin a,d a Center for Biological and Computational Learning, Artificial Intelligence Laboratory, and McGovern Institute, USA E-mail: [email protected]; [email protected] b MIT/Whitehead Institute, Center for Genome Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA E-mail: [email protected] c Department of Computer Scien...
Full text
Bayesian perspective over time
Thomas Bayes, the founder of the Bayesian viewpoint, entered the University of Edinburgh in 1719 to study logic and theology. Returning home in 1722, he worked with his father in a small church. He was also a mathematician, and around 1740 he made a novel discovery that he never published; after his death in 1761, his friend Richard Price found it in his notes, re-edited it, and published it. But until L...
Full text
Statistical Learning: CVloo stability is sufficient for generalization and necessary and sufficient for consistency of Empirical Risk Minimization
Solutions of learning problems by Empirical Risk Minimization (ERM) – and almost-ERM when the minimizer does not exist – need to be consistent, so that they may be predictive. They also need to be well-posed in the sense of being stable, so that they might be used robustly. We propose a statistical form of stability, defined in terms of the property of cross-validation leave-one-out (CVloo) stab...
Full text
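As a toy illustration of the CVloo stability notion mentioned in the last snippet (our own example, not the authors'), consider ERM for mean estimation under squared loss, where the empirical minimizer is simply the sample mean. The sketch below measures how much the loss at each point z_i changes when z_i itself is removed from the training set; for a CVloo-stable algorithm this change vanishes as n grows. The Gaussian data and the max-over-points statistic are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

def cvloo_instability(n):
    """Max change in squared loss at z_i when z_i is left out of training."""
    z = rng.normal(size=n)
    full_mean = z.mean()                      # ERM fit on the full sample
    loo_means = (z.sum() - z) / (n - 1)       # ERM fits with each z_i removed
    return np.abs((z - full_mean) ** 2 - (z - loo_means) ** 2).max()

for n in (100, 1000, 10000):
    print(n, cvloo_instability(n))            # decays roughly like O(1/n)
```

Here the leave-one-out fits are available in closed form, so no refitting loop is needed; the printed instability shrinking with n is the behavior that the CVloo stability definition formalizes.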